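Product description generation is a challenging and under-explored task. Most existing work takes a set of product attributes as input and then generates a description from scratch in a single pass. However, this widespread paradigm can be limiting when users have dynamic wishes for constraining the description, such as deleting or adding content about a user-specified attribute relative to a previous version. To address this challenge, we explore a new draft-editing approach to description generation, which leads to the proposed new task of controllable text editing in e-commerce. More specifically, we allow the system to receive a command (delete or add) from the user and then generate a description by flexibly modifying the content of the previous version. Meeting new requirements by modifying a previous version is easier and more practical than generating from scratch. Furthermore, we design a data augmentation method to remedy the low-resource challenge in this task; it combines a model-based strategy and a rule-based strategy to imitate human edits. To accompany this new task, we introduce a human-written command-edit dataset called e-cedits and a new metric, "Attribute Edit". Our experimental results show that using the new data augmentation method outperforms the baselines by a larger margin in both automatic and human evaluations.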
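Low-frequency word prediction remains a challenge for modern neural machine translation (NMT) systems. Recent adaptive training methods promote the output of infrequent words by emphasizing their weights in the overall training objective. Although these methods improve the recall of low-frequency words, the adaptive objectives unexpectedly hurt their prediction precision. Inspired by the observation that low-frequency words form a more compact embedding space, we tackle this challenge from a representation learning perspective. Specifically, we propose a frequency-aware token-level contrastive learning method, in which the hidden state of each decoding step is pushed away from the counterparts of other target words in a soft contrastive manner based on the corresponding word frequencies. We conduct experiments on the widely used NIST Chinese-English and WMT14 English-German translation tasks. Empirical results show that the proposed method not only significantly improves translation quality but also enhances lexical diversity and optimizes the word representation space. Further investigation reveals that, compared with related adaptive training strategies, the advantage of our method on low-frequency word prediction lies in the robustness of token-level recall across different frequencies without sacrificing precision.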
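The idea of a frequency-aware, token-level contrastive objective can be illustrated with a small sketch. The abstract does not give the loss formula, so everything below is an assumption made for illustration: the function names, the log-based `rarity` score, the averaging of the two rarity scores, and the `alpha` and `tau` parameters are hypothetical and are not taken from the paper.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def rarity(count, max_count):
    """Assumed rarity score: 0 for the most frequent token, 1 for a token seen once.

    Requires max_count > 1.
    """
    return 1.0 - math.log(count) / math.log(max_count)


def freq_contrastive_loss(hidden, tokens, counts, tau=0.1, alpha=1.0):
    """Token-level contrastive loss with frequency-dependent negative weights.

    hidden: one hidden-state vector per decoding step
    tokens: the target token at each step
    counts: corpus frequency of each token
    Steps that share a target token are positives; all other steps are
    negatives, weighted more heavily when the tokens involved are rare.
    """
    max_count = max(counts.values())
    unit = [normalize(h) for h in hidden]
    losses = []
    for i, tok_i in enumerate(tokens):
        pos, neg = 0.0, 0.0
        for j, tok_j in enumerate(tokens):
            if i == j:
                continue
            sim = math.exp(sum(a * b for a, b in zip(unit[i], unit[j])) / tau)
            if tok_j == tok_i:
                pos += sim
            else:
                # Hypothetical soft weight: grows with the rarity of both tokens.
                r = 0.5 * (rarity(counts[tok_i], max_count)
                           + rarity(counts[tok_j], max_count))
                neg += (1.0 + alpha * r) * sim
        if pos > 0.0:  # anchors without a positive are skipped
            losses.append(-math.log(pos / (pos + neg)))
    return sum(losses) / len(losses) if losses else 0.0
```

With `alpha=0` the sketch reduces to a plain contrastive loss; a positive `alpha` pushes hidden states of rare tokens away from other tokens more strongly.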
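Generative commonsense reasoning requires machines to generate sentences that describe an everyday scenario given several concepts, and it has attracted much attention recently. However, existing models cannot perform as well as humans, because the sentences they produce are often implausible and grammatically incorrect. In this paper, inspired by the process by which humans create sentences, we propose a novel knowledge-enhanced commonsense generation framework, termed KGR^4, which consists of four stages: Retrieval, Retrospect, Refine, and Rethink. Under this framework, we first perform retrieval to search an external corpus for relevant sentences to serve as prototypes. Then, we train a generator that edits or copies these prototypes to produce candidate sentences, which are then repaired by an autoencoder-based refiner. Finally, we select the output sentence from the candidate sentences produced by generators with different hyperparameters. Experimental results and in-depth analysis on the CommonGen benchmark strongly demonstrate the effectiveness of our framework. In particular, KGR^4 obtains 33.56 SPICE points on the official leaderboard, outperforming the previously reported best result by 2.49 SPICE points and achieving state-of-the-art performance.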
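Interactive and non-interactive models are the two de facto standard frameworks in vector-based cross-lingual information retrieval (V-CLIR); they embed queries and documents in synchronous and asynchronous fashion, respectively. In terms of retrieval accuracy and computational efficiency, each model has its own strengths and weaknesses. In this paper, we propose a novel framework that leverages the advantages of both paradigms. Specifically, we introduce a semi-interactive mechanism, which builds our model on a non-interactive architecture but encodes each document together with its associated multilingual queries. As a result, cross-lingual features can be learned better, as in an interactive model. In addition, we further transfer knowledge from a well-trained interactive model to ours by reusing its word embeddings and applying knowledge distillation. Our model is initialized from the multilingual pre-trained language model M-BERT and is evaluated on datasets derived from Wikipedia and on an in-house dataset collected from a real-world search engine. Extensive analyses show that our method significantly improves retrieval accuracy while maintaining computational efficiency.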
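The knowledge distillation step mentioned above can be sketched with the standard temperature-scaled formulation. The abstract does not say which distillation loss is used, so the KL-divergence form over candidate-document scores, the function names, and the `temperature` parameter below are assumptions for illustration only.

```python
import math


def softmax(scores, temperature=1.0):
    """Numerically stable softmax over a list of relevance scores."""
    peak = max(scores)
    exps = [math.exp((s - peak) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_scores, student_scores, temperature=2.0):
    """KL(teacher || student) over softened score distributions.

    teacher_scores: relevance scores from the interactive (teacher) model
    student_scores: scores from the non-interactive (student) model for the
                    same candidate documents
    The temperature**2 factor is the usual rescaling that keeps gradient
    magnitudes comparable across temperatures.
    """
    p = softmax(teacher_scores, temperature)
    q = softmax(student_scores, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)
    return kl * temperature ** 2
```

The loss is zero when the student ranks the candidates with the same softened distribution as the teacher, and it grows as the two distributions diverge.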
Dataset distillation has emerged as a prominent technique to improve data efficiency when training machine learning models. It encapsulates the knowledge from a large dataset into a smaller synthetic dataset. A model trained on this smaller distilled dataset can attain comparable performance to a model trained on the original training dataset. However, the existing dataset distillation techniques mainly aim at achieving the best trade-off between resource usage efficiency and model utility. The security risks stemming from them have not been explored. This study performs the first backdoor attack against the models trained on the data distilled by dataset distillation models in the image domain. Concretely, we inject triggers into the synthetic data during the distillation procedure rather than during the model training stage, where all previous attacks are performed. We propose two types of backdoor attacks, namely NAIVEATTACK and DOORPING. NAIVEATTACK simply adds triggers to the raw data at the initial distillation phase, while DOORPING iteratively updates the triggers during the entire distillation procedure. We conduct extensive evaluations on multiple datasets, architectures, and dataset distillation techniques. Empirical evaluation shows that NAIVEATTACK achieves decent attack success rate (ASR) scores in some cases, while DOORPING reaches higher ASR scores (close to 1.0) in all cases. Furthermore, we conduct a comprehensive ablation study to analyze the factors that may affect the attack performance. Finally, we evaluate multiple defense mechanisms against our backdoor attacks and show that our attacks can practically circumvent these defense mechanisms.
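As a rough illustration of the simpler of the two attacks, the sketch below stamps a fixed trigger onto a fraction of the raw images before they would be handed to a distillation procedure. The abstract only states that NAIVEATTACK adds triggers to the raw data; the patch location and size, the poisoning ratio, relabeling to a target class, and all function names here are assumptions, and the distillation step itself is not shown.

```python
import random


def stamp_trigger(image, patch_size=2, value=1.0):
    """Return a copy of a 2-D grayscale image with a square patch
    written into its bottom-right corner (assumed trigger shape)."""
    height, width = len(image), len(image[0])
    stamped = [row[:] for row in image]
    for r in range(height - patch_size, height):
        for c in range(width - patch_size, width):
            stamped[r][c] = value
    return stamped


def poison_dataset(images, labels, target_label, ratio=0.1, seed=0):
    """Stamp the trigger on a random subset and relabel it to target_label.

    Returns the new images, the new labels, and the set of poisoned indices.
    The input lists are left unmodified.
    """
    rng = random.Random(seed)
    n_poison = int(len(images) * ratio)
    chosen = set(rng.sample(range(len(images)), n_poison))
    new_images, new_labels = [], []
    for i, (img, lab) in enumerate(zip(images, labels)):
        if i in chosen:
            new_images.append(stamp_trigger(img))
            new_labels.append(target_label)
        else:
            new_images.append([row[:] for row in img])
            new_labels.append(lab)
    return new_images, new_labels, chosen
```

By contrast, DOORPING is described as updating the trigger iteratively throughout distillation, which a static stamp like this does not capture.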
Blind image quality assessment (BIQA) remains challenging because of the diversity of distortions and the variation in image content, which complicate distortion patterns across different scales and make the regression problem for BIQA harder. However, existing BIQA methods often fail to consider multi-scale distortion patterns and image content, and little research has been done on learning strategies that help the regression model perform better. In this paper, we propose a simple yet effective Progressive Multi-Task Image Quality Assessment (PMT-IQA) model, which contains a multi-scale feature extraction module (MS) and a progressive multi-task learning module (PMT), to help the model learn complex distortion patterns and to better optimize the regression problem by following the easy-to-hard order of human learning. To verify the effectiveness of the proposed PMT-IQA model, we conduct experiments on four widely used public datasets. The experimental results indicate that PMT-IQA outperforms the compared approaches and that both the MS and PMT modules improve the model's performance.
The development of social media user stance detection and bot detection methods relies heavily on large-scale, high-quality benchmarks. However, in addition to low annotation quality, existing benchmarks generally have incomplete user relationships, which hinders graph-based account detection research. To address these issues, we propose a Multi-Relational Graph-Based Twitter Account Detection Benchmark (MGTAB), the first standardized graph-based benchmark for account detection. To our knowledge, MGTAB was built on the largest original data in the field, with over 1.55 million users and 130 million tweets. MGTAB contains 10,199 expert-annotated users and 7 types of relationships, ensuring high-quality annotation and diversified relations. In MGTAB, we extracted the 20 user property features with the greatest information gain, together with user tweet features, as the user features. In addition, we performed a thorough evaluation of MGTAB and other public datasets. Our experiments found that graph-based approaches are generally more effective than feature-based approaches and perform better when multiple relations are introduced. By analyzing the experimental results, we identify effective approaches for account detection and provide potential future research directions in this field. Our benchmark and standardized evaluation procedures are freely available at: https://github.com/GraphDetec/MGTAB.
Given the increasingly intricate forms of partial differential equations (PDEs) in physics and related fields, computationally solving PDEs without analytic solutions inevitably suffers from the trade-off between accuracy and efficiency. Recent advances in neural operators, a class of mesh-independent neural-network-based PDE solvers, suggest that this challenge may be overcome. In this emerging direction, the Koopman neural operator (KNO) is a representative demonstration and outperforms other state-of-the-art alternatives in terms of accuracy and efficiency. Here we present KoopmanLab, a self-contained and user-friendly PyTorch module of the Koopman neural operator family for solving partial differential equations. Beyond the original version of KNO, we develop multiple new variants of KNO based on different neural network architectures to improve the general applicability of our module. These variants are validated by mesh-independent and long-term prediction experiments implemented on representative PDEs (e.g., the Navier-Stokes equation and the Bateman-Burgers equation) and ERA5 (i.e., one of the largest high-resolution data sets of global-scale climate fields). These demonstrations suggest that KoopmanLab could be useful in diverse applications of partial differential equations.
A recent study has shown a phenomenon called neural collapse, in which the within-class means of features and the classifier weight vectors converge to the vertices of a simplex equiangular tight frame at the terminal phase of training for classification. In this paper, we explore the corresponding structures of the last-layer feature centers and classifiers in semantic segmentation. Based on our empirical and theoretical analysis, we point out that semantic segmentation naturally brings contextual correlation and imbalanced distribution among classes, which breaks the equiangular and maximally separated structure of neural collapse for both feature centers and classifiers. However, such a symmetric structure is beneficial to discrimination for the minor classes. To preserve these advantages, we introduce a regularizer on feature centers to encourage the network to learn features closer to the appealing structure in imbalanced semantic segmentation. Experimental results show that our method can bring significant improvements on both 2D and 3D semantic segmentation benchmarks. Moreover, our method ranks 1st and sets a new record (+6.8% mIoU) on the ScanNet200 test leaderboard. Code will be available at https://github.com/dvlab-research/Imbalanced-Learning.
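The simplex equiangular tight frame referred to above has a compact closed form, which the following sketch constructs for K classes in a K-dimensional space. It uses the standard definition (a scaled, centered identity matrix); the function names are illustrative, and the regularizer proposed in the paper is not reproduced here.

```python
import math


def simplex_etf(num_classes):
    """Return K unit-norm vectors (as rows) forming a simplex equiangular
    tight frame: sqrt(K / (K - 1)) * (I - (1 / K) * ones)."""
    k = num_classes
    if k < 2:
        raise ValueError("a simplex ETF needs at least two classes")
    scale = math.sqrt(k / (k - 1))
    return [
        [scale * ((1.0 if i == j else 0.0) - 1.0 / k) for j in range(k)]
        for i in range(k)
    ]


def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))
```

Every pair of distinct vertices has cosine similarity -1/(K-1), which is the most negative value that K unit vectors can share pairwise; this is the "equiangular and maximally separated" structure that class imbalance is said to break.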
Benefiting from its ability to exploit intrinsic supervision information, contrastive learning has recently achieved promising performance in deep graph clustering. However, we observe that two drawbacks of the positive and negative sample construction mechanisms prevent existing algorithms from improving further. 1) The quality of positive samples depends heavily on carefully designed data augmentations, while inappropriate data augmentations easily lead to semantic drift and indiscriminative positive samples. 2) The constructed negative samples are unreliable because they ignore important clustering information. To solve these problems, we propose a Cluster-guided Contrastive deep Graph Clustering network (CCGC) that mines the intrinsic supervision information in the high-confidence clustering results. Specifically, instead of conducting complex node or edge perturbation, we construct two views of the graph by designing special Siamese encoders whose weights are not shared between the sibling sub-networks. Then, guided by the high-confidence clustering information, we carefully select and construct the positive samples from the same high-confidence cluster in two views. Moreover, to construct semantically meaningful negative sample pairs, we regard the centers of different high-confidence clusters as negative samples, thus improving the discriminative capability and reliability of the constructed sample pairs. Lastly, we design an objective function that pulls together samples from the same cluster and pushes away those from other clusters by maximizing the cross-view cosine similarity between positive samples and minimizing it between negative samples. Extensive experimental results on six datasets demonstrate the effectiveness of CCGC compared with the existing state-of-the-art algorithms.
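The objective described above, pulling same-cluster samples together across views while pushing different cluster centers apart, can be sketched as follows. The abstract does not give the exact loss, so the additive form, the use of mean embeddings as cluster centers, and all function names are assumptions. The sketch also assumes that every cluster has at least one high-confidence sample and that no embedding is the zero vector.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cluster_centers(embeddings, assignments, num_clusters):
    """Mean embedding of each cluster (assumes no cluster is empty)."""
    dim = len(embeddings[0])
    sums = [[0.0] * dim for _ in range(num_clusters)]
    counts = [0] * num_clusters
    for vec, k in zip(embeddings, assignments):
        counts[k] += 1
        for d in range(dim):
            sums[k][d] += vec[d]
    return [[s / counts[k] for s in sums[k]] for k in range(num_clusters)]


def cluster_guided_loss(view1, view2, assignments, num_clusters):
    """Positive term: 1 - cosine for cross-view pairs in the same cluster.
    Negative term: cosine between centers of different clusters across views.
    Requires num_clusters >= 2."""
    n = len(view1)
    pos = [
        1.0 - cosine(view1[i], view2[j])
        for i in range(n) for j in range(n)
        if assignments[i] == assignments[j]
    ]
    c1 = cluster_centers(view1, assignments, num_clusters)
    c2 = cluster_centers(view2, assignments, num_clusters)
    neg = [
        cosine(c1[k], c2[l])
        for k in range(num_clusters) for l in range(num_clusters)
        if k != l
    ]
    return sum(pos) / len(pos) + sum(neg) / len(neg)
```

In this sketch the loss is low when cluster assignments match the geometry of the embeddings and high when unrelated samples are grouped together.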